Prepared for submission to JCAP 



The rates and time-delay distribution 
of multiply imaged supernovae behind 
lensing clusters 

Xue Li,"'^ Jens Hjorth" and Johan Richard^ 

"Dark Cosmology Centre, Niels Bohr Institute, University of Copenhagen, 

Juliane Maries Vej 30, 2100, Copenhagen, Denmark 
*CRAL, Observatoire de Lyon, Universite Lyon 1, 

9 Avenue Ch. Andre, 69561 Saint Genis Laval Cedex, France 

E-mail: lixue@dark-cosmology.dk 

Abstract. Time delays of gravitationally lensed sources can be used to constrain the mass 
model of a deflector and determine cosmological parameters. We here present an analysis 
of the time-delay distribution of multiply imaged sources behind 17 strong lensing galaxy 
clusters with well-calibrated mass models. We find that for time delays less than 1000 
days, at z = 3.0, their logarithmic probability distribution functions are well represented 

by P(logAt) = 5.3 x IQ-^At^/M^^Q, with ^ = 0.77, where M250 is the projected cluster 
mass inside 250 kpc (in IO^'^Mq), and /3 is the power-law slope of the distribution. The 
resultant probability distribution function enables us to estimate the time-delay distribution 
in a lensing cluster of known mass. For a cluster with M250 = 2 x IO^^Mq, the fraction of time 
delays less than 1000 days is approximately 3%. Taking Abell 1689 as an example, its dark 
halo and brightest galaxies, with central velocity dispersions a ^ 500 km s~^, mainly produce 
large time delays, while galaxy-scale mass clumps are responsible for generating smaller time 
delays. We estimate the probability of observing multiple images of a supernova in the known 
images of Abell 1689. A two-component model of estimating the supernova rate is applied 
in this work. For a magnitude threshold of rriAB = 26.5, the yearly rate of Type la (core- 
collapse) supernovae with time delays less than 1000 days is 0.004 ± 0.002 (0.029 ± 0.001). 
If the magnitude threshold is lowered to m-AB ~ 27.0, the rate of core-collapse supernovae 
suitable for time delay observation is 0.044 it 0.015 per year. 

Keywords: gravitational lensing, galaxy clusters, supernova type la - standard candles, 
core-collapse supernovas 
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1 Introduction 

An object in our universe, such as a galaxy or a galaxy cluster, could bend light rays passing 
it and act as a lens to magnify or demagnify sources behind it [1]. This effect is known 
as gravitational lensing and has been developed into a powerful cosmological tool in recent 
decades [2] [3] [4] [5]. With the help of gravitational lensing, we can observe distant galaxies 
behind galaxy clusters which would otherwise be too faint to be observed, and analyze their 
properties. We can also measure the cosmological parameters that describe the geometry 
and the expansion rate of the universe [6] [7]. In addition, we can analyze the total mass 
distribution in lensing galaxy clusters [8] , regardless of the differences between luminous and 
dark matter [9]. 

In strong gravitational lensing systems, multiple images are produced. Light travels 
along the stationary paths between two points in space time. A massive object, like a galaxy 
or cluster of galaxies located along the light path, in general affects and perturbs the light 
trajectory [10]. When lensed by a galaxy or a cluster of galaxies, light emitted by a source 
may travel along different light paths and be observed as different multiple images. Light 
from these images is received at different times. Thus, multiple images have different arrival 
times to the observer. The difference of arrival times between multiple images of the same 
source is called the time delay. 

So far, time delays have been studied and applied in many ways: to constrain the Hubble 
parameter Hq [11] [12] [13]; to study the galaxy mass profile with Monte Carlo simulations 
[14]; to measure the cosmological parameter w [15] [16] [17] [18] [13]; to improve the mass 
models of galaxies with the time delays of quasars [19]. 

Compared to quasars, light curves of type la supernovae (SNe la) evolve regularly with 
time, and have been extensively studied [20] [21]. Hence, they are potentially very useful 
as standard sources for constraining time delays in gravitational lens systems. In principle, 
they can also be used to constrain the Hubble constant through the measurement of the 
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time delays [22]. SNe la play a key role as standard candles in distance measurements on 
cosmological scales. Supernovae provide direct evidence that the low-redshift universe is 
accelerating [23] [24] [25]. They act as the primary sources of heavy elements and potentially 
dust in the universe [26]. However, there is still debate on the progenitor models of SNe 
la [21] [27]. There are two major progenitor scenarios to explain the mechanism of Type 
la progenitor. In the single degenerate model [28], a carbon-oxygen white dwarf accretes 
mass from a companion star, a subgiant, a helium star or a red giant, and reaches the 
Chandrasekhar mass limit, resulting in a thermonuclear explosion [29] [30]. In the double 
degenerate scenario [31], two white dwarfs merge, approach the Chandrasekhar mass limit, 
and ignite. Recently, a third model gives another explanation of the possible SN progenitor 
scenarios. Instead of accreting the mass to the Chandrasekhar mass limit, the detonation 
ignites to the accreted He shell of one white dwarf, then the detonation shock wave comes to 
the core or near the center, and a second detonation happens [32] [33] [34]. 

The rate of supernovae (SNR) reflect their formation mechanism. For example, core- 
collapse SNe (Type II and Ibc supernovae) arising from massive stars help us trace the star 
formation and may be used to constrain the star formation rate (SFR) [35]. A well established 
model of estimating the SNR for la SNe (SNRja) is a "two-component" model [36] [37] [38] , 
with one component dependent on the recent star formation in the host galaxy and the other 
component dependent on the host stellar mass. The SN type la rate is a combination of 
these two components. The SNRia at intermediate redshift has been constrained using the 
SDSS-II dataset to z ^ 0.12 [39], and extended analysis to z < 0.3 [40]. At high redshift, 
the SN la rate is also tested and constrained, using the SNLS dataset to z « 0.5 [41] and 
0.1 < z < 1.1 [42], the HST/GOOBS survey up to z < 1.8 [43] [44], and the Subaru Deep 
Field (SDF) to z < 2 [45]. The core-collapse supernova rate is also tested and estimated up 
to z ~ 0.7, using the GOODS survey. With these data and the SNR model, we can estimate 
the lensed SNR in a cluster lensing system [46]. 

The aim of this paper is to (1) develop a general function to describe the time-delay 
distributions in gravitational lensing systems; (2) estimate the lensed SNR as a function of 
time delay and magnitude in Abell 1689 to assess the feasibility of constraining mass models 
and cosmological parameters with lensed supernovae observationally. 

The outline of the paper is as follows. In section 2, we develop a theoretical formalism for 
describing the time-delay distribution. The analysis and discussion of parameters for the dis- 
tribution function are developed in section 3. In section 4, we model time-delay distributions 
of 17 massive lensing clusters. We analyze and fit the parameters of the distribution function, 
based on the results from modeling. In section 5, using the "two-component" model, we cal- 
culate the probability of observing supernovae in 35 known multiply imaged galaxies behind 
Abell 1689. Finally, we summarize our investigation and discuss future prospects in section 6. 
Throughout this paper, we assume a cosmological model with Q,m = 0.3, = 0.7, h = 0.7. 
Magnitudes are in the AB system. 

2 Time delay theory 

A light ray is deflected when it passes a cosmic massive object. In a lensing system, the 
light path from the source to the observer is changed according to the gravitational field near 
the lens. In the case of a multiple image system, lensing generally causes a difference in the 
arrival time of a galaxy image pair and hence generates a time delay. The time delay. At, 
can be calculated as [10] 
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cAtiy) = go' (1 + ZLM{xi,y) - </.(x2,y)], (2.1) 

where go = 47r(^)^ '^j^^'"^ is the characteristic length scale in the lens plane. Here cr„ is the 
value of an effective velocity dispersion, Dqs is the angular diameter distance between the 
observer and the source, Dql is the angular diameter distance between the observer and the 
lens, Dls is the angular diameter distance between the lens and the source, and zl is the 
lens redshift. We denote the image position in the lens plane by ^ and the source position 
in the source plane by rj. Here y = \t]\/V0j is the source position in the source plane, with 
rjo = being the maximal distance to the caustic line. We define Xi{i = 1,2) as two 

image positions in the lens plane with x = |^|/go- Here x,y are dimensionless vectors. The 
Fermat potential is defined as (/){xi,y){i = 1,2). For a two-image system, the larger the 
difference of their Fermat potentials, the larger time delay they will generate. The Fermat 
potential (j){x,y) can also be described by the lensing potential if{x) [47], 

0(x,y) = ^^^-(p(x). (2.2) 

We assume spherically symmetric lenses in what follows. 

Generally, for a lens with density distribution p cc r^^ , the time delay can be expanded 
as [8] [48] 

(2-5)2 fAr' 



At{S) ^{6- l)Atsis 



1 



12 y{r) 



(2.3) 



where Aisig represents the time delay for the Singular Isothermal Sphere (SIS; 6 = 2), and 
(r) = (rj + rj)/2. If the term ^ is small, the higher order terms can be ignored. 

For real clusters, tidal perturbations from objects near the lens or along the line of sight 
[49] [50] may affect the images as well. An external shear can be added to the lens [48] , whose 
potential is 

Hi) = -^[71 (C? - Cl) + 272C1C2], (2.4) 

where 7 is the strength of the shear, 71 = 7 cos 20^ and 72 = 7sin20-y, with 6-y being the 
angle between the direction of the shear and the major axis of the lens. Here Ci, C2 represent 
the components of the lens coordinate. Then more than 2 multiple images may be produced 
by each source, and the time delay between images i and j is [48] 

cAt = ^^j^^°' (l + ZL){ir] - rl) + 7[r| cos 2{e, - 9,) - cos 2(6, - 6^)]}, (2.5) 

where = (C| + C|)^^' with k = i,j is the distance of the image from the center. The shear 
affects the time delay in two-image lenses with 7^ rj only slightly, while in four-image 
lenses with rj ~ rj, the shear may play a significant role. 

For a lens in a general quadrupole with total shear F = 'jint + "Jext, the time delay can 
be estimated as [8] [51] 

, sin2(A6'i,72) , , 

where {k) is the average surface density in the annulus bounded by the images in units of 
the critical surface density and fint = Jint/'^ is the fraction of the quadrupole. Here A6ij 
represents the angle between the images A9ij = 9i — 02- 
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To discuss the time-delay distribution in a lensing cluster, we start from a simple sit- 
uation, in which the mass distribution of a lensing cluster is described by a SIS profile 
and no more than two multiple images are generated from each source. With the density 

2 

P{r) = 4^G^^ the time delay is [52] 

cAtsis = 32vr2(^)'^f^(l + z.)y, (2.7) 
V c / Dos 

from which it follows that At oc y, i.e., the time delay is proportional to the location of the 
source. If the source is located within the strong lensing area enclosed by the caustic line, 
two images will be created. In this case, the normalized probability that the source is located 
between rj and rj + drj is [53] 

i-ri+dri ^ 

p{r],r] + dr]) = 2^dr/; ?? ^ r/Q. (2.8) 

Hence, the normalized time delay probability distribution function can be simplified as 



/(At) = 2--^; At ^ Atpeak. (2.9) 

peak 

Here f{x) denotes a probability distribution function throughout this paper and Atpeak is 
the time delay which has the largest probability. With /(At)d(At) = P(log At)d(log At), 
and d(At) = InlO • 10'°s^*(i(log At), it follows that P(log At) = InlO • 10'°g^*/(At), i.e., the 
probability distribution function of the logarithmic time delay for the SIS profile is 

P(logAt) = i^l02i°g^*; log At ^ log Atpeak. (2.10) 

^'•pcak 

The time-delay distribution is sensitive to the slope of the density profile [54] [55]. Steeper 
inner slopes tend to produce larger time delays [52]. There are several theoretical profiles 
established to describe the mass distribution of a cluster: the SIS profile, the dPIE profile 
(dual pseudo isothermal elliptical profile) [56], and the NFW profile (Navarro- Frenk- White 
profile) [57], etc. For a density distribution described as, p cc r~^, the density slopes 6 for 
these three profiles are listed in table 1. The slope of the density profile may affect the 
distribution of the time delay in this way: compared to the SIS profile, the NFW profile has 
shallower inner density slope, but steeper outer density slope, which means that on small time 
delay scales, the distribution of the time delay will be stretched out to a higher probability, 
but on large time-delay scales, it will be lower than the distribution of a SIS profile. The 
dPIE profile has even shallower inner density slope but steeper outer density slope than the 
NFW profile, so the time-delay distribution of dPIE profile has the shallowest slope among 
the three profiles. Equations (2.9) (2.10) show that the time-delay probability distribution 
function is a power law for the SIS profile. Motivated by the discussion above, we make 
the simple ansatz that the time-delay probability distribution function can be approximated 
as: /(At) oc At^^"^ and P(logAt) oc l0/''°s^*, or P(logAt) oc At^. More details on the 
comparison of the ansatz and simulation results are presented in section 3.1. 

This ansatz is based on the assumption of the time-delay distribution generated by a two- 
multiple-image system. For clusters with complicated structures, as shown in equations (2.3), 
(2.5), (2.6), e.g., clusters whose mass distributions are described by clumps of potentials, the 
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inner 5 


intermediate 5 


outer 5 


/3 


SIS 


-2 


-2 


-2 


2.00 


NFW 


-1 


-2 


-3 


1.20 


dPIE 





-2 


-4 


0.83 



Table 1. The slopes of the density profiles describing the mass distribution of a galaxy cluster and 
the slopes of the best-fitting functions to the time-delay distribution. More details of the best-fitting 
function and slope P are described in subsection 3.1. 



situation is more complicated because more images are produced so more time delays are 
generated. Fortunately, the simulations described in section 3.2 and figure 2 show that the 
distribution of time delays generated by multiple images still obey a power-law distribution. 
After normalization, the probability distribution functions can be written as: 

/(At) = -|— At^-i; At ^ Atpeak, (2.11) 

peak 

P(logAt) = ^^At^; At ^ Atpeak, (2.12) 

^*peak 

log P(log At) = /3 log(At) + log(/3 In 10) - /3 log Atpeak; At ^ Atpeak. (2.13) 
We can also write the cumulative probability distribution as 

/■^* f -^f- At ^ Atpeak, 

/(< At) = / f{x)dx = { (2.14) 

Jo [ 1 At > Atpeak. 

If we set /3 = 2, the probability distribution functions (2.11) and (2.12) reduce to the expres- 
sions for the SIS case, i.e., functions (2.9) and (2.10). 

According to the virial theorem [58], for a fixed radius, the enclosed mass is proportional 
to the velocity dispersion, M oc a^. From equation (2.7), for the SIS profile, Atpeak, sis °^ 
o"^ oc M^. More generally, for a general profile with density distribution described as p cc , 
the expression for the time delay can be well approximated as At oc Atsis('^ — 1) (2.3), i.e.. 
At a ((^ - l)M^. Hence, we can write the probability distribution function (2.13) as 

log P(log At) = /3 log(At) + log(/3^i In 10) - 2/3 log M250 + ^2, (2.15) 

where Ci,C2 are constants to be determined. Here M250 is defined as the projected mass 
within R < 250 kpc, in units of 10^^ Mq. In this equation, /3 and M250 are parameters. If the 
value of /3 is fixed, on the right hand of the equation (2.15), the second and the fourth terms 
(i.e., \og{P^'^ In 10) and C2) will be reduced to one constant: C'2 = C2 + log(/3^^ In 10). In this 
case, there will be only one constant C'2 to be determined. So the logarithmic probability 
distribution function can be reduced to 

logP(logAt) = /31og(At) - 2/31ogM25o + C;. (2.16) 

We will discuss j3 and C'2 in section 3. 
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Figure 1. The logarithmic probabihty distribution of the time delays for three density profiles. 
They are the SIS profile (black), the dPIE density profile (red) and the NFW density profile (blue). 
The background sources are at z = 3. The open circles and their poisson error bars are obtained 
from a numerical representation of the potentials. The best-fitting functions for each cluster are also 
plotted. The slopes of the best-fitting functions for the SIS profile, the NFW profile and the dPIE 
profile are ,3sis = 2.0, /3nfw = l-H, Pdpm = 0.78. 

3 Modeling clusters of galaxies 

In this section, we discuss how /3 depends on different mass profiles, and the effect of the 
mass on the time-delay distribution. Considering the time required for a realistic observation, 
small time delays, i.e., time delays no longer than 1000 days, are more suitable for an actual 
time-delay measurement in a reasonable amount of time. In this paper, we therefore focus 
on time delays less than 1000 days. We treat all time delays as independent from each other 
and the time delays are taken from all images pairs. 

To get the time-delay distribution, we create an input catalog of sources. Time delays 
are created when there are multiple images, so we need to make sure that the input source 
plane covers the area enclosed by the caustic line(s) and includes all potential multiply lensed 
sources. On the other hand, the input source plane should be sufficiently well sampled so 
as to be sensitive to the mass distribution and potential differences in the lens. This is to 
make sure that small time delays are also produced. In this work, we choose an input source 
plane covering an area of 60 x 60 arcsec with 200 x 200 pixels in the source plane at z = 3. 
With the help of Lenstool [59], we can obtain a list of all images of every source in the source 
plane. Then we compute the differences in the arrival times between multiple images from 
each source. For example, for a source with 3 multiple images: imagel, image2, image3, we 
compute the differences of arrival times between each two of these three images. Then we 
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get 3 time delays in total. 

3.1 The slope of the time-delay distribution 

The time-delay distributions for the SIS, the NFW and the dPIE profiles are simulated 
and computed. In simulating the time-delay distributions, we choose parameters similar to 
those of mass models of real clusters. We adjust the parameters of the profiles to make 
their distributions overlap. The results are shown in figure 1. In the NFW mass model, 
the velocity dispersion is 1810 km/s, the concentration is 3.579 and the scale radius is 618 
kpc. In the dPIE mass model, the velocity dispersion is 1500 km/s, the core radius is 72 
kpc, and the cut radius is 2000 kpc. For the mass model described by the SIS profile, we 
set the velocity dispersion to 400 km/s. A change in the velocity dispersion will shift the 
distribution horizontally but keep the slope unchanged. So for computational convenience, 
we keep the velocity dispersion and shift the distribution horizontally to make the results 
for three profiles overlap. All three profiles are circular. With time delays less than 1000 
days, the slopes of the logarithmic probability distribution functions for the SIS profile, the 
NFW profile and the dPIE profile, are /3sis = 2.0 as expected, /3nfw = 1-11, /3dPiE = 0.78, 
respectively (see also table 1). The difference in slopes of the distributions arises from the 
diff'erence in the density slopes of the mass profiles, as discussed in section 2. This shows 
that the slope of the time-delay distribution is strongly affected by the mass distribution, 
especially the density slope of the cluster. 

3.2 An example: Abell 1689 

With the help of deep Hubble Space Telescope (HST) imaging, Abell 1689 {z = 0.183) dis- 
plays a large number of multiple image systems at the center of the cluster. Using information 
from these systems, the mass model of Abell 1689 has been extensively studied. We base our 
work on the mass model consisting of 35 plausible lensed sources and 116 multiple images 
[60] [61] [7]. Among them, 25 sources have confirmed spectroscopic redshifts, the other 10 
systems have lensing modeling redshifts or/and photometric redshifts. 

To check how the time-delay distribution depends on the mass distribution, we split the 
mass model of Abell 1689 into two parts. The first part (massive components) consists of a 
dark halo and the three brightest galaxies with central velocity dispersions a ^ 500kms~^. 
The second part includes all other substructures (subcomponent), that is, all other grav- 
itational potentials with velocity dispersions a smaller than 500kms~^. We compute the 
time-delay distribution in two different ways: First, we produce all time delays for Abell 
1689 with the full mass model. Second, we compute time delays of Abell 1689 with only 
the mass model of the first part (massive components). The second part (subcomponent) 
itself produces only two multiple images, i.e., only one time delay. This is because most 
substructures do not gain enough mass to surpass the critical surface mass density, which is 
required for the structure to produce multiple images. 

The time-delay distribution for Abell 1689 is shown in figure 2. The histogram in green 
represents the distribution of the time delays of the 35 known sources. It is consistent with 
the simulation of the grid of hypothetical sources. A logarithmic probability distribution 
function (2.13) with slope (3 = 0.77 in red is also plotted. The blue curve represents the 
time delay distribution generated by the first part (massive components). From the figure, it 
is evident that the first part (massive components) succeeds in reproducing most relatively 
'large' time delays, e.g., time delays larger than around 30 days. We conclude that small 
time delays are predominantly generated by substructures in the mass model. 
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logAt (day) 

Figure 2. The time-delay distributions of Abell 1689 with modeled background sources at z = 3. 
The time-delay distribution produced by the cluster with the full mass model is in black. The blue 
curve represents the time-delay distribution produced by the dark halo and a few bright galaxies 
(massive components), which fails to reproduce small time delays. The distribution of the selected 
longest time delays from each source is in purple. The red curve represents the logarithmic probability 
distribution function with j3 = 0.77. The histogram in green represents the distribution of the time 
delays of 35 known multiply imaged sources. We are mainly interested in small time delays, i.e., those 
located to the left of the dotted orange line at 1000 days (log At — 3.0). The dashed line in olive 
represents Atpcak- 

The typical offset between the observed and modeled images is about ~ 1" [5] . Therefore, 
the individual time delays are affected. To test how the position offset affects the time-delay 
distribution, we plot the time delay distributions generated by two different mass models 
for Abell 1689. The result is shown in figure 2. The black curve represents the time-delay 
distribution applied in this paper, while the dotted curve is the result when another mass 
model [62] is used. Though different models may produce different time delays, the slope of 
the distribution of the time delays is not strongly affected. 

The time-delay distribution of a real cluster may be more complicated than the ones 
discussed in section 2. The structure of the lensing cluster may consist of more clumps of 
potentials with more complicated mass distribution, thus each source may produce more than 
2 multiple images. To check how these extra multiple images affect the time-delay distribution 
and whether the distributions for real clusters can also be fitted with the power-law functions, 
we also plot the distribution of the longest time delays generated by each source in figure 
2. The figure shows firstly that compared to the 'longest' time delays, the total time-delay 
distribution can be fitted to a power-law function. Secondly, for small time delays, though 
none of them belong to the 'longest' time delays and there is no direct connection between the 
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Cluster 




z 


RA (J2000.0) 

[dee] 

L oj 


Dec (J2000.0) 

[deg] 


^250 

(lO^^Mo) 




/9 

1 






Abell 2204 





152 


248.195540 


5 


575825 


2 


29± 


50 





79 ± 





03 


Abell 868 





154 


146.359960 


-8 


651994 


1 


97 ± 1 


11 





57 ± 





03 


RXJ 1720 





164 


260.041860 


26 


625627 


1 


18 ± 


59 





64 ± 





04 


Abell 2218 





171 


248.954604 


66 


212242 


3 


00 ± 


24 





89 ± 





04 


Abell 1689 





183 


197.872954 


-1 


341006 


4 


53 ± 


13 





79 ± 





03 


Abell 383 





188 


42.014079 


-3 


529040 


1 


87 ± 


26 





85 ± 





09 


Abell 773 





217 


139.472660 


51 


727024 


3 


01 ± 


58 





86 ± 





04 


RXJ 2129 





235 


322.416510 





089227 


1 


37± 


37 





74 ± 





03 


Abell 1835 





253 


210.258650 


2 


878470 


2 


83 ±0 


41 





89 ± 





05 


Abell 1703 





280 


198.771971 


51 


817494 


2 


98 ±0 


09 





77 ± 





08 


MACS 2135 





325 


323.800390 


-1 


049624 


2 


64 ±0 


04 





83 ± 





02 


MACS 1319 





328 


200.034880 


70 


077501 


2 


28 ±0 


26 





41 ± 





06 


MACS 0712 





328 


108.085460 


59 


538994 


1 


29 ±0 


27 





63 ± 





05 


MACS 0947 





345 


146.803230 


76 


387101 


2 


96 ±0 


94 





59 ± 





09 


SMACS 2248 





348 


342.183260 


-44 


530966 


2 


87 ±0 


06 





80 ± 





06 


MACS 1133 





389 


173.304880 


50 


144436 


1 


52 ±0 


23 





91 ± 





05 


MACS 1347 





451 


206.877570 


-11 


752643 


3 


86 ±0 


02 





77 ± 





05 



Table 2. Properties of 17 lensing clusters. The redshifts of the selected clusters range from 0.15 
to 0.30 for models from LoCuSS and from 0.30 to 0.45 for MACS clusters. Here M250 denotes the 
projected lensing mass inside 250 kpc [63] [62]. The slopes of /3 in function (2.16) are also listed. 

ansatz motivated by two image systems and those non-longest time delays, the distribution 
of the small time delays may also be fitted with a power-law function, which is implied by 
the simulation result of the total time delays. Finally, even though more multiple images are 
generated and more time delays are produced, the time-delay distribution still can be fitted 
well with a power-law function. Therefore, we proceed to apply the ansatz of the power-law 
functions (2.11) (2.12) deduced from two-image system to the real clusters with more multiple 
images. 

4 Time delays in 17 clusters 

4.1 Cluster selection and modeling 

To further analyze the logarithmic probability distribution function of time delays and con- 
strain the parameter and the constant in equation (2.13), we compute time-delay distribu- 
tions by modeling 17 lensing clusters. The cluster selection procedure is based on two criteria: 
First, each lensing cluster system should have at least one image with a spectroscopically- 
confirmed redshift. Furthermore, the range of redshifts of the selected clusters should be as 
large as possible. The redshifts of the selected clusters range from 0.15 to 0.30 for models 
from LoCuSS [63] and from 0.30 to 0.45 for MACS models [62]. The selected clusters are 
listed in table 2. 

The modeling procedure is the same as described in section 3. To make a reasonable 
comparison, the input source file for each cluster should have the same number of sources. 
Moreover, the whole source area should have the same size and be sufficiently sampled to 
cover the multiple image areas described by the caustic lines. 
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Figure 3. The distribution of parameter /? for probability distribution functions of 17 clusters. The 
modeling sources are at 2; = 3. The probability distribution functions are fitted to the modeling data 
of time delays within 1000 days. The parameter range is /? G [0.41, 0.91]. 



4.2 Estimating (3 

As for Abell 1689 (section 3.2), we fit power-law distribution functions to time-delay dis- 
tributions generated from the 17 cluster models. The functions are fitted to the data with 
time delays less than 1000 days. The cluster masses [61] [63] and the fitting values of /3 
are shown in table 2. The distribution of the parameter /3 is shown in figure 3. The mean 
value is ^ = 0.75, and the median value is /3 = 0.79. With the least squares method, if 
the clusters are weighted equally to each other, we determine that the best-fitting slope is 
j3 = 0.77 for the 17 clusters, with standard deviations in logP(At) in the range [0.11, 0.30]. 
As a consequence, to a good approximation, /3 can be fixed in equation (2.16). 

4.3 Parameter estimation 

We fix the value of /3 = 0.77 in the logarithmic probability distribution function (2.16) and fit 
the function to the simulated data, and then find the best-fitting value for constant C2. With 
smallest deviations, we find that the best-fitting value is C2 = —3.28. So the logarithmic 
probability distribution function (2.16) can be written as 



log P(log At) = /3 log At - 2/3 log M250 



3.28 



(4.1) 



or 



P(log At) = 5.27 X lO^'^At'^/M; 



(4.2) 
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Figure 4. The logarithmic probabiHty distribution functions for 17 clusters with background sources 
at z = 3. The solid curves represent the modeling results of time-delay distribution of 17 clusters. 
The red shaded regions represent the logarithmic probability distribution function from equation 
(4.3). The uncertainties of the distribution functions arise from the uncertainties in the mass. For 
MACS1133, the input source area is 10 x 10 arcsec. For other clusters, the input source areas are 60 
X 60 arcsec. 



If we introduce /3 = 0.77, then the logarithmic probability distribution function is simply 

logP(log At) = 0.77 log At - 1.54 log M250 - 3.28. (4.3) 

For a cluster with M250 = 2 x 10^^ M©, the probability of a time delay less than 1000 days is 
about 0.025. In figure 4, we present modeled time-delay distributions for all 17 clusters and 
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their logarithmic probability distribution functions (4.3) with sources at z = 3. Among the 
17 systems, MACS1133 has much smaller multiple-image area enclosed by the caustic lines. 
To make the small time delays detectable, for MACS1133, we change the input source area 
to 10 X 10 arcsec. For other clusters, we keep the input source as 60 x 60 arcsec. 



5 The rate of lensed supernovae in Abell 1689 

With the help of the gravitational lensing amplification, we can potentially observe super- 
novae which would otherwise be too faint for detection. If a source is located inside the 
multiply imaged surface defined by the caustic line, two or more images will be generated. 
We may therefore observe multiple images of a supernova in the source galaxies. 

For Abell 1689, we know 35 multiply imaged sources and 116 corresponding multiple 
images. For each multiple-image pair producing a time delay, we call the image arriving 
first to the observer the leading image. For example, for a source with 3 multiple images: 
imagel, image2, image3, assuming the images have arrival times Timagei < Timage2 < Timagea.-, 
there are three image pairs: pairl2, pairl3 and pair23. The corresponding leading images are 
imagel, imagel and image2, accordingly. So in this three-image system, only the image with 
the longest arrival time (in this case, it is imageS) cannot be the leading image. That is, for 1 
source having 3 multiple images, the number of the leading images are 3 — 1 = 2. Therefore, 
in a lensing system with m sources and n corresponding multiple images, the number of the 
leading images are n — m. According to this definition, in Abell 1689, there are (116 — 35 =) 
81 leading images in total. 

To estimate the probability of observing a leading supernovae image in Abell 1689, we 
need to know the supernova rate. The models for describing the supernova rate are dependent 
on the types of the supernova. The rate of core-collapse supernovae (SNRcc) can be obtained 
from the star- formation rate: 

^-lO-f^l. (5.) 



yr 1 J \Mq yr- 

where the parameter k^c can be determined by measuring the SNRcc and SFR. Here the 
factor of 10^'^ is multiplied into the function to simplify the parameter k^c- We also multiply 
factors in the following functions (5.2) (5.7) for the same reason. The SNRcc and SFR can 
be derived from observational data [43] [64] . By using the core-collapse SN rate density and 
comparing it against the SFR density, the parameter is constrained to be kcc = 7.5 it 2.5 [36]. 
Alternatively, by using a Salpeter IMF and a progenitor mass ranging between 8 and 50 solar 
masses, the parameter is estimated to be kcc = 7 [46]. In this paper, we choose kcc = 10 as 
the upper limit, and kcc = 5 as the lower limit. 

For Type la supernovae, we use the popular "two-component" model to estimate the 
Type la supernova rate (SNRia) [36] [37] [38]: 

/^SNRiaA J _-in^M*A" , .r.-i( SFR 



yr-i ; VMo; VMoyr-^ ' ' ^ ' 



where is the host stellar mass, and a denotes the exponent of the stellar mass. The first 
component describes the stellar mass contribution. The second component describes the host 
galaxy star-formation contribution. For parameters A and a, we choose A = 1.05 it 0.16 and 
a = 0.68 lb 0.01 [65]. The parameter B can be related to the SNRcc — SFR relation (5.1), 

B = kccO, (5.3) 
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Figure 5. The cumulative rate (A'sn) of the leading images of lensed supernovae in Abell 1689, 
derived from equations (5.1) and (5.7). The magnitude threshold is 26.5 and a total number of 70 
among the 72 leading images are included because of the magnitude threshold. The time delays are 
calculated based on the modeled source positions. The red curve represents the estimated cumulative 
SNRia, while the cumulative SNRcc is plotted in blue curve. The uncertainties of -/Vsn arise from the 
upper and lower limits of the parameters in the functions (5.1) (5.2) (5.7). 



where Q = SNRja/SNRcc. The value of has been estimated at redshift up to z ~ 1.5 [43]. 
At redshift z < 1, the ratio of SNRia/SNRcc approximately ranges between 6 = 1/2 and 
O = 1/4, which is consistent with the result Q = 0.35 ± 0.08 in nearby galaxies [37]. At 
higher redshift z > 1.0, inspired from figure 3 of [43], we assume 

6 = :^:; 1.0 <z. (5.4) 
15 

Considering all sources in Abell 1689 have redshifts z > 1.0, in this work, B = 0.5 it 0.17. 
This value is consistent with B = 0.39 it 0.07 (based on redshift 0.2 < z < 0.75 [38]). 

From their magnitude in F775W, we estimate the flux and the luminosity and then 
constrain their SFR [66] as 

This conversion between UV flux and the SFR is for rest wavelengths, ranging from 1500 
A to 2800 A, while our data [60] have observed wavelength in the range 6900 A to 8600 A. 
We assume these galaxies have flat spectra, as is typical for star-forming galaxies, so we can 
calculate the luminosity from the flux. 
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Note that multiple images from the same source should have the same inferred luminos- 
ity. To infer the luminosity of each image, we need to know their fluxes. With magnitudes of 
116 images [60] [61], we can estimate their fluxes. Using Lenstool, we can get a magnification 
map on the image plane then read the values of the magnification on the map. Considering 
the gravitational magnification effect and k correction [67] [68], the flux can be calculated as 



where // is the gravitational magnification factor, and z is the source redshift. When images 
are located close to the caustics, their gravitational magnification factors (/i) may be very 
large. Thus, the values of fluxes of these images derived from equation 5.6 are uncertain, and 
their luminosities estimated from the fluxes may not be reliable. So we neglect these images 
and average other luminosity values of images of the same source to constrain L,^. When we 
know their luminosities, from equation (5.5), we can calculate their SFR, and then SNRia 
(5.7) and SNR^c (5.1). 

We also need to calculate the stellar masses and the star-formation rates for each galaxy. 
To estimate the stellar-mass contribution to the SNR, we separate the images into two groups. 
In group one, with data of photometry from HST/ ACS in bands B, V, I, Z, ground-based 
near-infrared imaging and Spitzer/IRAC photometry [61], we derive their stellar mass from 
SED fitting [69]. In group two with insufficient photometric data, we estimate the mass 
contribution based on the ratio of mass and star-formation contribution to the SNR derived 
from group one. The median value of this ratio, is only 2.9% (4.2%) of the upper (lower) 
limits of the star-formation part. In this paper, we choose its median value and assume the 
mass part contributes 3.5 % of the star-formation part. Therefore, for group two, the SNRja 
may be estimated as 



We assume Type la supernovae have absolute magnitude M = —19.3 mag [43], and core- 
collapse supernovae have M = —17.0 mag [70]. With their absolute magnitudes, considering 
the k correction, we calculate the apparent magnitudes for each supernova. Their apparent 
magnitudes of a supernova can be expressed as: 



Here Dl denotes the luminosity distance. This equation is used to constrain the magnitude 
when considering the magnitude thresholds in figures 5 and 7. 

The cumulative rate of observing the leading images of the lensed supernovae in Abell 
1689 is shown in figure 5. Among the total 81 leading images, we exclude 14 images from 5 
sources with insufficient magnitude data [60], so there are 72 leading images to be considered. 
Here A'^gN(< log At) represents the cumulative rate of the supernova images observable in one 
year. We set the detection threshold to 26.5 mag in all bands. In that case, a total number 
of 70 of the 72 leading images are included. The resulting probabilities are 0.004 it 0.002 
for the Type la supernovae and 0.029 it 0.001 for the core-collapse supernovae in one year, 
assuming time delays less than 1000 days. 

We also compare our results to the results from other groups [46] . The result is shown in 
figure 6. The difference in supernovae rates may arise from the difference in the lens model 
and/or the SNR prescription. For example, different lens models may produce different 



log(F^) = [mAB + 2.51og(|/i|) + 2.51og(l + z) + 48.6]/(-2.5) 



(5.6) 




(5.7) 



m = M + 51ogio(i^L/10 pc) - 2.51ogio(|/x|) - 2.51ogio(l + z). 
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Figure 6. The comparison of tlie supernova rate estimated from functions (5.1) and (5.7), and the 
supernova rate from [46], with time delays less than 5 years. The green and purple curves represent 
the logarithmic cumulative SNRcc and SNRia from [46]. Our results are shown in blue for SNRcc 
and in red for SNRia. The uncertainties of logNg^ arise from the upper and lower limits of the 
parameters in the functions (5.1) (5.2) (5.7). At log At ^ 1.8, there is a fast increase in the logiVgN. 
Firstly, in this region, there are more leading images included, so both our results and the results 
from [46] raise quickly. Secondly, the leading images have much larger SNR and uncertainties in their 
result. This causes an even faster raise and higher upper limit in their curves. 



magnification factors for each images, and then produce different luminosities, and SFR. In 
addition, the different parameters chosen in the SNR model (5.2), e.g., kcc, A, a, B may 
also affect the constraint on SNR. In this paper, we choose kcc = 7.5 it 2.5, A = 1.05 it 0.16, 
a = 0.68 ± 0.01, ^ = 0.5 ± 0.17, while in [46], the parameters are kcc = 7.0, A = 4.4+}j, 
a = 1.0, B = 2.6ibl.l. Larger values of A, a, B chosen will cause a higher SNR estimate. We 
applied the parameters of [46] to the functions (5.1) (5.2) (5.7), in an attempt to reproduce 
their results. The results fit well to their SNRia, but a significant discrepancies remain for 
SNRcc. 

We estimate the probability of observing the leading supernova images as a function of 
magnitude threshold in figure 7. The figure shows the cumulative rates of observable leading 
supernova images as a function of magnitude threshold, with time delays less than 1000 days. 
For a magnitude threshold of 27.0, we can observe 0.044 it 0.015 core-collapse supernovae per 
year, with time delays less than 1000 days. Under the conditions of small time-delay scales 
and limited magnitude threshold, the probability of observing a leading supernova image is 
quite low. 
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Figure 7. The cumulative rate of observing a leading supernovae image, as a function of magnitude 
threshold. The images have time-delay separations less than 1000 days. The uncertainties of A^sn 
arise from the upper and lower limits of the parameters in the functions (5.1) (5.2) (5.7). The Type la 
supernovae involved are much brighter than those of the core-collapse supernovae. This causes a 'shift' 
in the distributions of the rates of Type la supernovae. At toab around 24.7 for Type la supernovae, 
the curve flattens out. This is because all the Type la supernovae involved in the calculation have 
magnitudes smaller than about 24.7. The HST telescope has the magnitude limit in detecting the 
lensed sources. The James Webb Space Telescope may detect fainter lensed sources behind Abell 
1689. As a sequence, Type la supernovae fainter than 24.7 will be detected, and the cumulative rate 
of observing a leading Type la supernovae {Ng-^) will increase at m^Q > 24.7 as well. This is also the 
case for the core-collapse supernovae when ttt-ab > 27.0. 

6 Summary and discussion 

We analyzed the time-delay distributions in strong lensing systems. We found that we can 
describe the probability distribution of time delays as a power law function (2.16). In the 
function, there are two parameters, -M250, (3, and a constant Cg. Modeling with mass profiles 
of SIS, NFW and dPIE (in figure 1), we found that the parameter /3 is strongly affected by 
the slopes of the mass profiles of the lensing clusters. The shallower the inner density profile 
and the steeper the outer density profile are, the more the time-delay distribution will be 
stretched out to both the higher and the lower end, causing a lower (3. By modeling Abell 
1689, we found that the massive galaxies and halos mainly produce large time delays, while 
small time delays are predominated produced by substructures (galaxies) in the cluster. We 
also simulated and verified that the time-delay distribution generated by 'real' clusters with 
more than 2 multiple images from the same sources also obey the power-law distribution in 
figure 2. 
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To estimate the parameter and the constant in the logarithmic probabihty distribution 
function, we modeled 17 strong lensing clusters as shown in figure 4, using their well calibrated 
mass models. With the fixed best-fitting slope /3 = 0.77, we determined the best-fitting value 
of 6*2 to the function (2.16). The resultant logarithmic probability distribution function (4.3) 
enables us to estimate the time-delay distribution of a cluster with known mass. 

We also calculated the probability of observing the leading images of the lensed super- 
novae in Abell 1689. The SNRcc can be derived from the SFR (5.1). The "two component" 
model was applied to constrain the SNRia. We constrained the parameters in the function 
(5.2), and calculated the SNR for Type la supernova. We estimated the luminosity from 
magnitudes of images in Abell 1689 (5.6), derived the SFR from the luminosity (5.5), and 
then estimated the probability of observing a leading supernova image in the system as shown 
in figure 5. Considering a typical magnitude limit of observations with mAB = 26.5, we can 
observe 0.004 it 0.002 Type la supernovae and 0.029 it 0.001 core-collapse supernovae per 
year. We compared the results in this work to [46] as shown in figure 6, and discussed the 
possible reasons which may cause the differences. 

We also constrained the cumulative rate of observing a leading supernovae image, as a 
function of the magnitude threshold (in figure 7). If the magnitude limit is lowered to 27.0, 
the probability of observing the leading images of the core-collapse supernovae will be up to 
0.044 lb 0.015 per year, with image separations within 1000 days. This probability is quite 
low, which means that detecting time delays from lensed supernovae will be challenging with 
current facilities. 
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